A non-parametric k-nearest neighbour entropy estimator

Authors

  • Damiano Lombardi
  • Sanjay Pant
Abstract

A non-parametric k-nearest neighbour based entropy estimator is proposed. It improves on the classical Kozachenko-Leonenko estimator by considering non-uniform probability densities in the region of k-nearest neighbours around each sample point. It aims at improving the classical estimators in three situations: first, when the dimensionality of the random variable is large; second, when near-functional relationships leading to high correlation between components of the random variable are present; and third, when the marginal variances of random variable components vary significantly with respect to each other. Heuristics on the error of the proposed and classical estimators are presented. Finally, the proposed estimator is tested for a variety of distributions in successively increasing dimensions and in the presence of a near-functional relationship. Its performance is compared with a classical estimator and shown to be a significant improvement.
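
For context, the classical Kozachenko-Leonenko estimator referred to in the abstract estimates differential entropy from the distance between each sample and its k-th nearest neighbour. Below is a minimal sketch of that classical baseline (not of the proposed correction for non-uniform local densities), assuming i.i.d. samples, Euclidean distances, and an illustrative function name:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kozachenko_leonenko_entropy(samples, k=3):
    """Classical Kozachenko-Leonenko differential entropy estimate (in nats).

    samples : (N, d) array of i.i.d. draws from an unknown density.
    k       : number of nearest neighbours (excluding the point itself).
    """
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    # Distance from each point to its k-th nearest neighbour
    # (query k+1 neighbours because the closest one is the point itself).
    dist, _ = cKDTree(samples).query(samples, k=k + 1)
    eps = dist[:, -1]
    # Log-volume of the d-dimensional unit ball: c_d = pi^(d/2) / Gamma(d/2 + 1)
    log_unit_ball = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
    # H ≈ psi(N) - psi(k) + log(c_d) + (d/N) * sum_i log(eps_i)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(eps))

# Example: a standard 2-D Gaussian, whose true entropy is 0.5*d*log(2*pi*e) ≈ 2.838 nats
rng = np.random.default_rng(0)
print(kozachenko_leonenko_entropy(rng.standard_normal((5000, 2)), k=5))
```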

Similar resources

Asymptotic Behaviors of Nearest Neighbor Kernel Density Estimator in Left-truncated Data

Kernel density estimators are the basic tools for density estimation in non-parametric statistics. The k-nearest neighbor kernel estimators represent a special form of kernel density estimators, in which the bandwidth is varied depending on the location of the sample points. In this paper, we initially introduce the k-nearest neighbor kernel density estimator in the random left-truncatio...
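
To illustrate the adaptive-bandwidth idea described above (leaving aside the left-truncated setting that is this paper's actual focus), a k-nearest-neighbour kernel density estimate can take the bandwidth at each evaluation point to be the distance to its k-th nearest sample. Here is a minimal one-dimensional sketch; the Gaussian kernel and the function name are illustrative assumptions:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kernel_density(x_eval, samples, k=20):
    """k-NN kernel density estimate in 1-D: the bandwidth at each
    evaluation point is the distance to its k-th nearest sample."""
    samples = np.asarray(samples, dtype=float).reshape(-1, 1)
    x_eval = np.asarray(x_eval, dtype=float).reshape(-1, 1)
    n = samples.shape[0]
    # Location-dependent bandwidth h(x) = distance to the k-th nearest sample
    h = cKDTree(samples).query(x_eval, k=k)[0][:, -1]
    # Gaussian kernel averaged over the sample, with bandwidth h(x)
    u = (x_eval - samples.T) / h[:, None]
    return np.exp(-0.5 * u**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))

# Example: density of a standard normal at a few points (true values ≈ 0.242, 0.399, 0.242)
rng = np.random.default_rng(1)
data = rng.standard_normal(2000)
print(knn_kernel_density([-1.0, 0.0, 1.0], data, k=50))
```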

Efficient multivariate entropy estimation via k-nearest neighbour distances

Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this paper, we seek entropy estimators that are efficient in the sense of achieving the local asymptotic minimax lower bound. To this end, we initially study a generalisation of the estimator originally proposed by Ko...

A Nearest-Neighbour Approach to Estimation of Entropies

The concept of Shannon entropy as a measure of disorder is introduced and the generalisations of the Rényi and Tsallis entropy are motivated and defined. A number of different estimators for Shannon, Rényi and Tsallis entropy are defined in the theoretical part and compared by simulation in the practical part. In this work the nearest neighbour estimator presented in Leonenko and Pronzato (2010...
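
For reference, the three entropies named above can be made concrete for a discrete probability vector (the cited works estimate the analogous continuous quantities from nearest-neighbour distances). This sketch uses the standard textbook definitions; the function names are illustrative only:

```python
import numpy as np

def shannon(p):
    """Shannon entropy H = -sum_i p_i log p_i (nats)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def renyi(p, alpha):
    """Rényi entropy H_alpha = log(sum_i p_i^alpha) / (1 - alpha); tends to Shannon as alpha -> 1."""
    p = np.asarray(p, dtype=float)
    return np.log(np.sum(p**alpha)) / (1.0 - alpha)

def tsallis(p, q):
    """Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1); tends to Shannon as q -> 1."""
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p**q)) / (q - 1.0)

p = np.array([0.5, 0.25, 0.25])
print(shannon(p), renyi(p, 0.999), tsallis(p, 1.001))  # all close to ≈ 1.0397 nats
```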

Estimating Individual Tree Growth with the k-Nearest Neighbour and k-Most Similar Neighbour Methods

The purpose of this study was to examine the use of non-parametric methods in estimating tree level growth models. In non-parametric methods the growth of a tree is predicted as a weighted average of the values of neighbouring observations. The selection of the nearest neighbours is based on the differences between tree and stand level characteristics of the target tree and the neighbours. The ...

Generalized K-Nearest Neighbour Algorithm- A Predicting Tool

The k-nearest neighbour algorithm is a non-parametric machine learning algorithm generally used for classification. It is also known as instance-based learning or lazy learning. The k-NN algorithm can also be adapted for regression, that is, for estimating continuous variables. In this research paper the researchers present a generalized k-nearest neighbour algorithm used for predicting a continuous value. In or...
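
As a sketch of how k-NN can be adapted to regression in the sense described above, the prediction below is an inverse-distance weighted average of the k nearest training responses; the weighting scheme and function name are illustrative assumptions, not the specific generalization proposed in that paper:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_regress(x_query, X_train, y_train, k=5):
    """Predict a continuous value as an inverse-distance weighted
    average of the k nearest training responses."""
    X_train = np.atleast_2d(np.asarray(X_train, dtype=float))
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    dist, idx = cKDTree(X_train).query(x_query, k=k)
    w = 1.0 / (dist + 1e-12)           # avoid division by zero for exact matches
    w /= w.sum(axis=1, keepdims=True)  # normalise weights per query point
    return np.sum(w * np.asarray(y_train, dtype=float)[idx], axis=1)

# Example: recover y = sin(x) from noisy samples
rng = np.random.default_rng(2)
X = rng.uniform(0, 2 * np.pi, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
print(knn_regress([[np.pi / 2], [np.pi]], X, y, k=15))  # roughly [1.0, 0.0]
```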

Journal:
  • CoRR

Volume: abs/1506.06501  Issue:

Pages: -

Publication date: 2015